Abstract

It is proved that the only additive and isotropic information measure that can depend 
on the probability distribution and also on its first derivative is a linear combination of 
the Boltzmann-Gibbs-Shannon and Fisher information measures. Power law equilibrium 
distributions are found as a result of the interaction of the two terms. The case of second 
order derivative dependence is investigated and a corresponding additive information measure 
is given. 
I. INTRODUCTION



In 1957, in his famous paper on information theory and statistical mechanics, Jaynes
suggested looking at statistical mechanics as a form of statistical inference. He argued
that the usual rules of statistical mechanics are justified independently of experimental
verification and additional physical arguments, because "they still represent the best
estimates that could have been made on the basis of information available". He
recognized the physical importance of the train of thought of Shannon, where it was
proved that the logarithmic form of the information measure is a consequence of some
simple properties that any information measure should have. Based on this observation,
he suggested starting statistical physics from a maximum entropy principle.

In a discrete probability space, where a variable $x$ can assume the discrete
values $(x_1, \dots, x_n)$ with the corresponding probabilities $(p_1, \dots, p_n)$, there exists a function
$H(p_1, \dots, p_n)$ which

- is continuous,

- is monotonically increasing with $n$ for equal probabilities,

- satisfies the composition law.

These conditions are those that we would expect from a measure of information. With
the above properties the function $H(p_1, \dots, p_n) = -k \sum_{i=1}^{n} p_i \ln p_i$ is unique up to the
positive multiplier $k$.
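
For independent sources the composition law reduces to additivity: for a joint distribution $p_i q_j$ one expects $H(pq) = H(p) + H(q)$. A minimal numerical sketch of this property, assuming $k = 1$ and natural logarithms (the function name `shannon_entropy` and the sample probabilities are illustrative, not taken from the text):

```python
import math

def shannon_entropy(probs, k=1.0):
    """H = -k * sum_i p_i ln p_i, using the convention 0 * ln 0 = 0."""
    return -k * sum(p * math.log(p) for p in probs if p > 0.0)

# Two independent discrete sources (illustrative values).
p = [0.5, 0.3, 0.2]
q = [0.7, 0.3]

# The joint distribution of independent sources is the outer product p_i * q_j.
joint = [pi * qj for pi in p for qj in q]

h_sum = shannon_entropy(p) + shannon_entropy(q)
h_joint = shannon_entropy(joint)

# For equal probabilities H = ln n, which increases monotonically with n.
h_uniform = [shannon_entropy([1.0 / n] * n) for n in range(1, 6)]
```

Here `h_joint` and `h_sum` agree up to rounding, and `h_uniform` is an increasing sequence.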

Later these conditions were investigated, generalized and clarified extensively both 
from mathematical and from physical points of view. It turned out that continuity is 
not so important. Monotonicity can be replaced by concavity and determines the sign 
of the function only. The most important assumption from a physical point of view is 
the composition law. Later the composition law was reformulated in a more convenient 
way as additivity. Rényi recognized that the only possibility to find a different additive
measure of information is to generalize also the method of averaging.

The uniqueness is the key feature that connects the information measure to the
physical entropy and ensures that the consequences of the macroscopic Second Law 
(first of all the existence of the universal, absolute temperature) are valid for the 
quantities of the statistical physical approach. Because of the uniqueness, the arising 
statistics is the same, independently of the microscopic dynamics. 



As additivity gives the connection between the statistical and thermodynamic theories,
its investigation is particularly interesting if one would like to explain phenomena
that are seemingly outside the framework of traditional methods of statistical physics.

In all previously mentioned research it was assumed explicitly that the entropy is
a local function of the variables. In this paper we weaken this assumption and look for
additive entropy functions on a continuous state space that depend not only
on the probability distribution but also on its derivatives.
In the following we prove generalizations of the statement of Shannon
for derivative dependent information measures. In the next section we prove that the
unique additive, isotropic, continuously differentiable entropy/information measure that depends
on the probability distribution and its first derivatives is a linear combination of
the Boltzmann-Gibbs-Shannon and the Fisher information measures. In the third
section we investigate the physical meaning of the constructed unique information
measure. Then we construct an additive information measure that contains second
order derivatives. Finally, we draw some conclusions.

II. WEAKLY NONLOCAL INFORMATION MEASURES: FIRST ORDER 

Let us consider an $n$ dimensional continuous probability space $\mathbb{R}^n$, where the
probability measure can be given by a continuously differentiable nonnegative function,
the probability density $f : \mathbb{R}^n \to \mathbb{R}^+$, which is normalized,

$$\int_X f(x)\,dx = 1. \tag{II.1}$$

A first order weakly nonlocal information measure is a function $s(f, Df)$ of the probability
density $f$ and its derivative $Df$ with some expected properties. An information
measure is positive, increases with increasing uncertainty, and is additive for independent
sources of uncertainty. In the case of derivative dependent information measures it
is convenient to require the isotropy of $s$, too. These conditions can be formulated as
follows:

1. Isotropy. An isotropic function $s$ of $f$ and $Df$ has the following form

$$s(f, Df) = s(f, (Df)^2). \tag{II.2}$$
2. Additivity. For the sake of simplicity we restrict ourselves to two independent
distribution functions $f_1(x_1)$ and $f_2(x_2)$ defined on $X = X_1 \times X_2 \subset
\mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$, $n_1 + n_2 = n$. The generalization to a finite number of distributions is
straightforward. Then additivity requires

$$s_n(f_1 f_2, D(f_1 f_2)) = s_{n_1}(f_1, Df_1) + s_{n_2}(f_2, Df_2), \tag{II.3}$$

where the subscripts denote the different dimensions of the domains.

Without isotropy, additivity cannot be formulated easily because the domain of the
function $s$ is the same on both sides of the above formula. Although most probability
distributions in physics are defined on spaces that are highly anisotropic, here we
restrict ourselves to isotropic information measures on isotropic state spaces and leave
that problem for further investigation. Fortunately, in the simplest situation, when the
state space is the Cartesian product of isotropic subspaces (position-momentum), one
can keep the simple formulation of additivity with some straightforward assumptions.

For independent probability distributions the joint probability density $f(x_1, x_2)$
is the product of the probability densities $f_1(x_1)$ and $f_2(x_2)$. Thus, we have
$Df(x_1, x_2) = (f_2(x_2) D_{x_1} f_1(x_1),\, f_1(x_1) D_{x_2} f_2(x_2))$ and, omitting the variables $x_1$ and
$x_2$, $(Df)^2 = (f_2 Df_1)^2 + (f_1 Df_2)^2$. As a consequence, for isotropic information measures
the additivity requirement can be written as

$$s(f_1 f_2, (f_2 Df_1)^2 + (f_1 Df_2)^2) = s(f_1, (Df_1)^2) + s(f_2, (Df_2)^2). \tag{II.4}$$

Differentiating the above equality by $(Df_1)^2$ and $(Df_2)^2$, respectively, we have that

$$f_2^2\, \partial_2 s(f_1 f_2, (f_2 Df_1)^2 + (f_1 Df_2)^2) = \partial_2 s(f_1, (Df_1)^2),$$
$$f_1^2\, \partial_2 s(f_1 f_2, (f_2 Df_1)^2 + (f_1 Df_2)^2) = \partial_2 s(f_2, (Df_2)^2).$$

Here $\partial_2$ denotes the partial derivative of $s$ by its second argument. Multiplying the first
equality by $f_1^2$ and the second by $f_2^2$, the left hand sides coincide, so
$f_1^2\, \partial_2 s(f_1, (Df_1)^2) = f_2^2\, \partial_2 s(f_2, (Df_2)^2)$ for independent arguments. Therefore

$$f^2\, \partial_2 s(f, (Df)^2) = -\kappa_1 = \text{const.},$$

hence

$$s(f, (Df)^2) = -\kappa_1 \frac{(Df)^2}{f^2} + \tilde{s}(f). \tag{II.5}$$

Here $\tilde{s}$ is an arbitrary function (the local part of the entropy). Repeating the above
train of thought with the derivatives by the first argument of $s$, one finds that

$$f\, \partial_1 \tilde{s}(f) = -\kappa = \text{const.}$$

Consequently, $\tilde{s}(f) = -\kappa \ln f + s_0$, where $s_0 = 0$ by further applying additivity.
Therefore, the most general isotropic and additive first order weakly nonlocal information
measure is

$$s(f, Df) = -\kappa_1 \frac{(Df)^2}{f^2} - \kappa \ln f. \tag{II.6}$$

The first term has the form of a Fisher information and the second term has
the form of a Shannon information measure. It is clear from the previous calculations
that (II.6) is unique with the above requirements (isotropy and additivity).
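
The additivity of the measure with a squared-gradient term divided by $f^2$ and a logarithmic term can be checked numerically at a single point. The following is a minimal sketch in pure Python; the function names, the values of $\kappa$ and $\kappa_1$, and the sample values of the densities and gradients are all illustrative assumptions. It uses the product rule $D(f_1 f_2) = (f_2 Df_1, f_1 Df_2)$ from the derivation above.

```python
import math

def s_first_order(f, grad, kappa=1.0, kappa1=0.5):
    """Entropy density s = -kappa1 * |Df|^2 / f^2 - kappa * ln f."""
    grad_sq = sum(g * g for g in grad)
    return -kappa1 * grad_sq / f**2 - kappa * math.log(f)

# Values of two independent densities and their gradients at some point
# (arbitrary numbers; X_1 is 2-dimensional, X_2 is 1-dimensional).
f1, df1 = 0.8, [0.3, -0.4]
f2, df2 = 1.7, [0.25]

# Product density and its gradient: D(f1 f2) = (f2 Df1, f1 Df2).
f12 = f1 * f2
df12 = [f2 * g for g in df1] + [f1 * g for g in df2]

lhs = s_first_order(f12, df12)
rhs = s_first_order(f1, df1) + s_first_order(f2, df2)

# Counterexample: dividing the gradient term by f instead of f^2
# breaks the additivity of s itself.
def s_not_additive(f, grad):
    return -sum(g * g for g in grad) / f

gap = s_not_additive(f12, df12) - (s_not_additive(f1, df1) + s_not_additive(f2, df2))
```

Here `lhs` and `rhs` agree up to rounding, while `gap` is clearly nonzero. Note that the averaged quantity $f s$ of the additive measure contains the familiar Fisher form $(Df)^2 / f$.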

III. POWER LAW TAILS IN MICROCANONICAL AND CANONICAL ENSEMBLES

There are several attempts to find the physical significance of Fisher information.
The observation in the previous section puts these investigations
into a new light. Accepting Jaynes' approach in suggesting the central role of information
in statistical physics, one should require the extremum of the two terms together.
However, even if we accept the idea of Jaynes that the unique information measures
are important from a physical point of view, there are several important questions to
be answered. E.g., what is the physics behind the second term? What could be the
value of the constant $\kappa_1$?

Let us consider a classical one dimensional ideal gas where the Hamiltonian is given
as $H(p) = \frac{p^2}{2m}$, where $p$ is the momentum and $m$ is the mass of the particles. In this
case, according to the maximum entropy principle, one should find the maximum of the
average of the entropy density (II.6) subject to the constraints of fixed average energy
and normalization. Therefore we face the following variational problem:

$$\int \left( -\kappa_1 \frac{(f')^2}{f} - \kappa f \ln f \right) dp - \beta \left( \int f \frac{p^2}{2m}\, dp - E \right) - \alpha \left( \int f\, dp - 1 \right) = \text{extr.} \tag{III.1}$$

In this case the partition function formalism does not help; the above variational
problem leads, up to the sign convention of the multipliers $\beta$ and $\alpha$, to the following
Euler-Lagrange equation for $R = \sqrt{f}$:

$$4\kappa_1 R'' - 2\kappa R \ln R + \frac{\beta}{2m} p^2 R + (\alpha - \kappa) R = 0. \tag{III.2}$$

Here the prime denotes the derivative $R' = \frac{dR}{dp}$. The corresponding natural boundary
condition can be interpreted as an entropy current $J_s = R' \delta R$ [11].
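
The step from the variational problem to the Euler-Lagrange equation can be sketched as follows. This is a worked derivation under stated assumptions: the multiplier terms are taken to enter as $-\beta(\dots) - \alpha(\dots)$, the substitution $f = R^2$ is used, and the symbol $\mathcal{L}$ for the integrand is introduced here only for illustration.

```latex
% Substitution f = R^2 in the integrand of the constrained functional,
% assuming the multiplier terms enter as -\beta(...) - \alpha(...):
\begin{align*}
\frac{(f')^2}{f} &= \frac{(2RR')^2}{R^2} = 4(R')^2, \qquad f \ln f = 2R^2 \ln R, \\
\mathcal{L}(R, R') &= -4\kappa_1 (R')^2 - 2\kappa R^2 \ln R
    - \frac{\beta}{2m} p^2 R^2 - \alpha R^2, \\
0 &= \frac{1}{2}\left(\frac{\partial \mathcal{L}}{\partial R}
    - \frac{d}{dp}\frac{\partial \mathcal{L}}{\partial R'}\right)
   = 4\kappa_1 R'' - 2\kappa R \ln R - \frac{\beta}{2m} p^2 R - (\alpha + \kappa) R.
\end{align*}
% Limit \kappa_1 = 0: the equation becomes algebraic,
%   \ln R = -\frac{\alpha + \kappa}{2\kappa} - \frac{\beta p^2}{4 m \kappa},
% so f = R^2 \propto \exp(-\beta p^2 / (2 m \kappa)).
```

Replacing $\beta \to -\beta$ and $\alpha \to -\alpha$ reproduces the signs of (III.2). In the limit $\kappa_1 = 0$ the solution is the Gaussian Maxwell-Boltzmann distribution, so the Fisher term is what moves the equilibrium away from it.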
For any positive $\kappa_1$ the solutions of the above equation differ from the classical
Maxwell-Boltzmann distribution, but the properties of the distribution are similar. Let
us choose $\frac{\kappa}{2\kappa_1} = 0.1$, $\frac{\beta}{8m\kappa_1} = 1$ and $\frac{\alpha - \kappa}{4\kappa_1} = A$. The symmetric solutions of the above
equation with the condition $R'(0) = 0$ have finite support if $A > 0.9897$, the value
where we can get back a Maxwell-Boltzmann like Gaussian distribution. Above that
value the function has a power law tail of the form $R_{tail}(p) = D + Cp^{-1}$, as one can see
from Figure 1.




FIG. 1: The square root of the distribution function f(p) for different values of A = 
(1,1.1,1.2,1.3,1.5,2,5). 

We have obtained a one parameter family of power law tail distributions parameterized
by the Lagrange multiplier $\alpha$ ($A$). The parameter can be related to different nonzero
entropy currents at the boundary, as one can see from the boundary conditions defined
above. The arising family of power law tail distributions differs from the distributions
of the nonadditive (nonextensive) statistics of Tsallis, Beck-Cohen or Kaniadakis
and is defined by a unique and additive information measure.

A related answer to the above questions emerges considering quantum mechanics as
a particular application and focusing on the concavity properties of the Fisher term.
With a suitable reinterpretation of the terms one can recognize a time independent
Schrödinger equation of a harmonic oscillator in (III.2) if $\kappa = 0$. The connection
of quantum mechanics to Fisher information was pointed out by several independent
studies. We may determine the constant $\kappa_1$ in these systems. E.g., for a
single particle system one gets $\kappa_1 = h^2$.

Understanding the role of Fisher information in quantum mechanics can give
some clues to further understanding the physics behind (II.6). In quantum mechanics
only the Fisher term appears, and one should explain the missing Boltzmann-Gibbs-Shannon
term in an information theoretical approach. In this respect Hall and
Reginatto argued by strengthening some basic laws of quantum mechanics and introduced
the exact uncertainty principle. Ván and Fülöp suggested mass scale
invariance, requiring the possibility of particle interpretation for the probability distribution.

On the other hand, Bialynicki-Birula and Mycielski give an example showing that the
quantum potential could be supplemented by a Boltzmann-Gibbs-Shannon term.
The solution of the supplemented Schrödinger equation gives nondispersive free solutions,
the so-called "Gaussons", as a result of the additional logarithmic term.

IV. WEAKLY NONLOCAL INFORMATION MEASURES: SECOND ORDER

One can ask about properties of information measures depending on higher order
derivatives of the distribution function. Here we investigate the second order case. The
requirements are similar to the previous ones:

1. Isotropy. According to the representation theorems of isotropic functions that
depend on a vector ($Df$) and a symmetric second order tensor ($D^2 f$) we can
write:

$$s(f, Df, D^2 f) = s\big(f, (Df)^2, Df \cdot D^2 f \cdot Df, Df \cdot D^2 f \cdot D^2 f \cdot Df, \dots, Df \cdot (D^2 f)^{n-1} \cdot Df, \mathrm{Tr}(D^2 f), \mathrm{Tr}(D^2 f \cdot D^2 f), \dots, \mathrm{Tr}((D^2 f)^n)\big).$$

2. Additivity. Here we require that

$$s(f_1 f_2, D(f_1 f_2), D^2(f_1 f_2)) = s(f_1, Df_1, D^2 f_1) + s(f_2, Df_2, D^2 f_2).$$
As previously, we need to consider isotropy in the formulation of additivity. Let us
observe that the above form (IV.1) is strongly restricted by the requirement of additivity.
Simple calculations show that the second order nonlocal information measure is
more complicated than the first order one. Its form depends on the dimension of the phase
space. As an example I give the general additive version of the following isotropic
function (this is the unique general form of a weakly nonlocal information measure in
three dimensions)

$$s_3(f, Df, D^2 f) = s_3\big(f, (Df)^2, Df \cdot D^2 f \cdot Df, Df \cdot (D^2 f)^2 \cdot Df, \mathrm{Tr}(D^2 f), \mathrm{Tr}((D^2 f)^2), \mathrm{Tr}((D^2 f)^3)\big). \tag{IV.1}$$

One can derive that 
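$$\begin{aligned}
s_3(f, Df, D^2 f) = {} & -\kappa \ln f - \kappa_1 \frac{(Df)^2}{f^2} - (\kappa_2 + \kappa_5) \frac{(Df)^4}{f^4} + (\kappa_3 + \kappa_6) \frac{(Df)^6}{f^6} \\
& + (\kappa_2 + 2\kappa_5) \frac{1}{f^3} Df \cdot D^2 f \cdot Df - (2\kappa_3 + 3\kappa_6) \frac{(Df)^2}{f^5} Df \cdot D^2 f \cdot Df + (\kappa_3 + 3\kappa_6) \frac{1}{f^4} Df \cdot (D^2 f)^2 \cdot Df \\
& - \kappa_4 \frac{1}{f} \mathrm{Tr}(D^2 f) - \kappa_5 \frac{1}{f^2} \mathrm{Tr}((D^2 f)^2) - \kappa_6 \frac{1}{f^3} \mathrm{Tr}((D^2 f)^3).
\end{aligned} \tag{IV.2}$$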
The concavity properties of the above function are not straightforward. 
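
One way to see why additive second order measures exist at all is to pass to $g = \ln f$. For a product density $g = g_1 + g_2$, so $Dg$ is the concatenation of $Dg_1$ and $Dg_2$, and $D^2 g = D^2 f / f - Df \otimes Df / f^2$ is block diagonal. Every isotropic invariant built from $Dg$ and $D^2 g$ is therefore additive. The sketch below checks this numerically in pure Python for a 2+1 dimensional split; the helper names and the sample numbers are illustrative assumptions, not taken from the text.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def quad(v, a):
    """Quadratic form v . a . v."""
    n = len(v)
    return sum(v[i] * a[i][j] * v[j] for i in range(n) for j in range(n))

def log_invariants(f, grad, hess):
    """Invariants of g = ln f: Dg = Df/f, D^2 g = D^2 f/f - Df Df^T/f^2."""
    n = len(grad)
    dg = [x / f for x in grad]
    b = [[hess[i][j] / f - grad[i] * grad[j] / f**2 for j in range(n)]
         for i in range(n)]
    b2 = matmul(b, b)
    b3 = matmul(b2, b)
    return [math.log(f), sum(x * x for x in dg), quad(dg, b), quad(dg, b2),
            trace(b), trace(b2), trace(b3)]

# f1 lives on a 2-dimensional space, f2 on a 1-dimensional one:
# value, gradient and Hessian at a point (arbitrary numbers).
f1, a1, h1 = 0.8, [0.3, -0.4], [[0.5, 0.1], [0.1, -0.2]]
f2, a2, h2 = 1.7, [0.25], [[-0.6]]

# Product density f = f1 f2 on the 3-dimensional space.
n1, n2 = len(a1), len(a2)
f = f1 * f2
grad = [f2 * x for x in a1] + [f1 * x for x in a2]
hess = [[0.0] * (n1 + n2) for _ in range(n1 + n2)]
for i in range(n1):
    for j in range(n1):
        hess[i][j] = f2 * h1[i][j]          # diagonal block f2 * D^2 f1
    for j in range(n2):
        hess[i][n1 + j] = a1[i] * a2[j]     # mixed derivatives Df1 (x) Df2
        hess[n1 + j][i] = a2[j] * a1[i]
for i in range(n2):
    for j in range(n2):
        hess[n1 + i][n1 + j] = f1 * h2[i][j]  # diagonal block f1 * D^2 f2

joint = log_invariants(f, grad, hess)
parts = [x + y for x, y in zip(log_invariants(f1, a1, h1),
                               log_invariants(f2, a2, h2))]
max_gap = max(abs(x - y) for x, y in zip(joint, parts))
```

Here `max_gap` vanishes up to rounding. Any linear combination of the seven returned quantities is thus an additive, isotropic second order function in three dimensions, and expanding them in terms of $f$, $Df$ and $D^2 f$ produces terms of the type appearing in (IV.2).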



V. CONCLUSIONS 



There are several attempts to understand the reason of the appearance of Fisher
information in different disciplines of physics [9, 13, 20, 21]. The usual justification and
interpretation is based on estimation theory. The above proof of uniqueness explains
why any dynamical background that preserves the additivity gives the same Fisher
like form, independently of the estimation theoretical background.

We have seen that for information measures with second order derivatives the number
of additive terms depends on the dimension of the probability (phase) space. Therefore
the physical significance of information measures containing higher than first order
derivatives is dubious.

There are more questions than answers in this work. However, in the relationship of 
thermodynamics and statistical physics the idea of Jaynes is the key of understanding 
that deserves further investigations. 